Top Clients v2 / Top Clients
M
Data
Structure
INFORMATION_SCHEMA
PUBLIC
Cells
Clients Table
Schema of Columns
Check for Duplicates in Clients Table
Nulls in district_ID Column in Clients Table
Verification that NCES_District_ID has only Numerical Values
District ID, Client_Add_Date, and Client_Cancel_Date
Former Clients Active Account Time
Clients with Client_Cancel_Date but no Client_Add_Date
Initial Join Check Between Clients and Transactions to Look For Missing Values
Possible Data Errors in Clients Table
Nulls in City or State Column in Clients Table
Number of Distinct States
Nulls in District_Type Column of Clients Table
Distribution of District Types in Clients Table
Nulls in Lowest_grade, Highest_grade, and Level Columns in Clients Table
Frequency of Lowest_Grades in Clients Table
Frequency of Highest_Grades in Clients Table
Frequency of Levels in Clients Table
Overview of District_Type in Clients Table
Possible Instances of Incorrect Data
Frequency of Number of Schools Per District
Districts With No Schools
Number of Nulls in Status Column
Number of District_IDs per Status Category (All WERE OPEN)
Clients Table
Schema of Columns
Check for Duplicates in Clients Table
Nulls in district_ID Column in Clients Table
Verification that NCES_District_ID has only Numerical Values
District ID, Client_Add_Date, and Client_Cancel_Date
Former Clients Active Account Time
Clients with Client_Cancel_Date but no Client_Add_Date
Initial Join Check Between Clients and Transactions to Look For Missing Values
Possible Data Errors in Clients Table
Nulls in City or State Column in Clients Table
Number of Distinct States
Nulls in District_Type Column of Clients Table
Distribution of District Types in Clients Table
Nulls in Lowest_grade, Highest_grade, and Level Columns in Clients Table
Frequency of Lowest_Grades in Clients Table
Frequency of Highest_Grades in Clients Table
Frequency of Levels in Clients Table
Overview of District_Type in Clients Table
Possible Instances of Incorrect Data
Frequency of Number of Schools Per District
Districts With No Schools
Number of Nulls in Status Column
Number of District_IDs per Status Category (All WERE OPEN)
Number of Null Values in charter_district Column
Number of Charter Vs Non-Charter Districts
Transactions Table
Number of Rows in Transactions Table
Number of Unique District Ids in Transactions Table
Number of Transactions by District ID from Transactions Table
Enrollment Table
Total Rows in Enrollment Table
Total Unique District IDs in Enrollment Table
Change Report Year and School Year Format in Enrollment Table
Partially Revised Enrollment Table
Enrollment_Revised Table
Finances Table
Total Rows in Finances Table
Total Unique District_ids in Finances Table
finances table
finances_r table
finances_revised table
Client_Transactions_2020 Table
Total Rows in client_transactions_2020 Table
Total Unique District_IDs
Districts Ordered By Net Volume
client_transactions_2020
Mean Payment Volume
Median Payment Volume
Clients Categorized by Top 1%
client_transactions_2020_ranked
State and Number of Districts
State, City, Number of Districts
District Type Distribution For Top Performing Clients
Number of Schools Per District For Top Performing Clients
Charter District Status for Top 1%
Level of Top 1%
Total_Revenue of Top 1%
Total Expenses
local_revenue
Local Vs Total Revenue
District Net Income
Number of Transactions In 2020
Account Activation Table
Cancelled By State
Cancelled By District Type
Cancelled By Level
Cancelled By Number of Schools
Cancelled By Charter District (Charter of Non Charter
Top Clients
GOALS:
0
  1. Identify top clients in 2020 and what characteristics they shared.
    0
  2. What sets the top clients apart from other clients?
    0
  3. How can we identify which clients could become top clients in future years?
    0
  4. What characteristics may indicate that our clients will cancel?
    0
  5. Future Changes/Investigations
    0

Part 1. Data Cleaning + Initial Data Summary
0

0
I explicitly used SQL queries to demonstrate my data cleaning process for the Clients table. In reality, I used the order by and filter features in the interactive table and went back to write queries to describe what I was looking for. For the remaining tables, I only included SQL queries for things that I needed to fix.
0
Clients Table
Untitled

NCES_DISTRICT_ID
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
CITY
STATE
DISTRICT_TYPE
LOWEST_GRADE
HIGHEST_GRADE
LEVEL
NUMBER_OF_SCHOOLS
STATUS
CHARTER_DISTRICT
Use ⌥ + scroll to pan the table
12 columns, 1,825 rows
3.9 seconds (16 hours ago)
0
Schema of Columns
Untitled

TABLE_CATALOG
TABLE_SCHEMA
TABLE_NAME
COLUMN_NAME
DATA_TYPE
Use ⌥ + scroll to pan the table
5 columns, 348 rows
2.5 seconds (16 hours ago)
0
Check for Duplicates in Clients Table
Untitled

NCES_DISTRICT_ID
COUNT
Use ⌥ + scroll to pan the table
2 columns, 0 rows
2.5 seconds (16 hours ago)
0
Nulls in district_ID Column in Clients Table
Untitled

NCES_DISTRICT_ID
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
CITY
STATE
DISTRICT_TYPE
LOWEST_GRADE
HIGHEST_GRADE
LEVEL
NUMBER_OF_SCHOOLS
STATUS
CHARTER_DISTRICT
Use ⌥ + scroll to pan the table
12 columns, 0 rows
2.3 seconds (16 hours ago)
0
Verification that NCES_District_ID has only Numerical Values
Untitled

NCES_DISTRICT_ID
NCES_DISTRICT_ID2
Use ⌥ + scroll to pan the table
2 columns, 1,825 rows
2.4 seconds (16 hours ago)
0
District ID, Client_Add_Date, and Client_Cancel_Date
Untitled

NCES_DISTRICT_ID
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
Use ⌥ + scroll to pan the table
3 columns, 1,825 rows
0.8 seconds (17 hours ago)
0
Former Clients Active Account Time
Untitled

NCES_DISTRICT_ID
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
DAYS_DIFF
Use ⌥ + scroll to pan the table
4 columns, 286 rows
1.2 seconds (17 hours ago)
0
Clients with Client_Cancel_Date but no Client_Add_Date
Untitled

NCES_DISTRICT_ID
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
Use ⌥ + scroll to pan the table
3 columns, 26 rows
1.3 seconds (17 hours ago)
0
Initial Join Check Between Clients and Transactions to Look For Missing Values
Untitled

NCES_DISTRICT_ID
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
PAYMENT_DATE
PAYMENT_VOLUME
PAYMENT_QTY
Use ⌥ + scroll to pan the table
6 columns, 26 rows
1.5 seconds (18 hours ago)
0
Possible Data Errors in Clients Table
Untitled

NCES_DISTRICT_ID
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
PAYMENT_DATE
PAYMENT_VOLUME
PAYMENT_QTY
Use ⌥ + scroll to pan the table
6 columns, 10,000+ rows
3.5 seconds (18 hours ago)
0
Nulls in City or State Column in Clients Table
Untitled

NCES_DISTRICT_ID
CITY
STATE
Use ⌥ + scroll to pan the table
3 columns, 0 rows
2.4 seconds (16 hours ago)
0
Number of Distinct States
Untitled

STATE_COUNT_CHECK
Use ⌥ + scroll to pan the table
1 column, 1 row
2.4 seconds (16 hours ago)
0
Nulls in District_Type Column of Clients Table
Untitled

NCES_DISTRICT_ID
DISTRICT_TYPE
Use ⌥ + scroll to pan the table
2 columns, 0 rows
2.4 seconds (16 hours ago)
0
Distribution of District Types in Clients Table
Untitled

DISTRICT_TYPE
COUNT
Use ⌥ + scroll to pan the table
2 columns, 8 rows
1.7 seconds (18 hours ago)
0
Nulls in Lowest_grade, Highest_grade, and Level Columns in Clients Table
Untitled

NCES_DISTRICT_ID
LOWEST_GRADE
HIGHEST_GRADE
LEVEL
Use ⌥ + scroll to pan the table
4 columns, 0 rows
2.4 seconds (16 hours ago)
0
Frequency of Lowest_Grades in Clients Table
Untitled

LOWEST_GRADE
COUNT
Use ⌥ + scroll to pan the table
2 columns, 14 rows
2.4 seconds (16 hours ago)
0
Frequency of Highest_Grades in Clients Table
Untitled

HIGHEST_GRADE
COUNT
Use ⌥ + scroll to pan the table
2 columns, 13 rows
2.4 seconds (16 hours ago)
0
Frequency of Levels in Clients Table
Untitled

LEVEL
COUNT
Use ⌥ + scroll to pan the table
2 columns, 7 rows
2.4 seconds (16 hours ago)
0
Overview of District_Type in Clients Table
Untitled

NCES_DISTRICT_ID
DISTRICT_TYPE
LOWEST_GRADE
HIGHEST_GRADE
LEVEL
Use ⌥ + scroll to pan the table
5 columns, 16 rows
2.4 seconds (16 hours ago)
0
Possible Instances of Incorrect Data
Untitled

NCES_DISTRICT_ID
DISTRICT_TYPE
LOWEST_GRADE
HIGHEST_GRADE
LEVEL
Use ⌥ + scroll to pan the table
5 columns, 11 rows
2.4 seconds (16 hours ago)
0
Frequency of Number of Schools Per District
Frequency of Number of Schools Per District
NUMBER_OF_SCHOOLS
COUNT
Use ⌥ + scroll to pan the table
2 columns, 79 rows
2.4 seconds (16 hours ago)
0
Districts With No Schools
Districts With No Associated Schools
NCES_DISTRICT_ID
NUMBER_OF_SCHOOLS
DISTRICT_TYPE
LOWEST_GRADE
HIGHEST_GRADE
Use ⌥ + scroll to pan the table
5 columns, 12 rows
1.1 seconds (18 hours ago)
0
Number of Nulls in Status Column
Untitled

STATUS
NCES_DISTRICT_ID
Use ⌥ + scroll to pan the table
2 columns, 0 rows
2.4 seconds (16 hours ago)
0
Number of District_IDs per Status Category (All WERE OPEN)
Untitled

STATUS
COUNT
Use ⌥ + scroll to pan the table
2 columns, 1 row
2.4 seconds (16 hours ago)
0
Number of Null Values in charter_district Column
Untitled

CHARTER_DISTRICT
NCES_DISTRICT_ID
Use ⌥ + scroll to pan the table
2 columns, 0 rows
2.4 seconds (16 hours ago)
0
Number of Charter Vs Non-Charter Districts
Untitled

CHARTER_DISTRICT
COUNT
Use ⌥ + scroll to pan the table
2 columns, 2 rows
2.4 seconds (16 hours ago)
0
Transactions Table
Untitled

NCES_DISTRICT_ID
PAYMENT_DATE
PAYMENT_TYPE
PAYMENT_VOLUME
PAYMENT_QTY
_SDC_BATCHED_AT
_SDC_RECEIVED_AT
_SDC_SEQUENCE
_SDC_TABLE_VERSION
Use ⌥ + scroll to pan the table
9 columns, 10,000+ rows
4.7 seconds (16 hours ago)
0
Number of Rows in Transactions Table
Untitled

total district_ids
Use ⌥ + scroll to pan the table
1 column, 1 row
2.4 seconds (16 hours ago)
0
Number of Unique District Ids in Transactions Table
Untitled

unique district_ids
Use ⌥ + scroll to pan the table
1 column, 1 row
2.4 seconds (16 hours ago)
0
Number of Transactions by District ID from Transactions Table
Untitled

NCES_DISTRICT_ID
number of transactions
Use ⌥ + scroll to pan the table
2 columns, 1,406 rows
2 seconds (17 hours ago)
0
Number of Transactions by District ID
Use ⌥ + scroll to pan the visual
2 columns, 1,406 rows
0
Enrollment Table
Untitled

NCES_DISTRICT_ID
REPORT_YEAR
SCHOOL_YEAR
STUDENT_COUNT
LOAD_DATE
FILENAME
Use ⌥ + scroll to pan the table
6 columns, 10,000+ rows
4.2 seconds (16 hours ago)
0
Total Rows in Enrollment Table
Untitled

total district_ids
Use ⌥ + scroll to pan the table
1 column, 1 row
2.4 seconds (16 hours ago)
0
Total Unique District IDs in Enrollment Table
Untitled

unique district_ids
Use ⌥ + scroll to pan the table
1 column, 1 row
2.4 seconds (16 hours ago)
0
Change Report Year and School Year Format in Enrollment Table
Untitled

NCES_DISTRICT_ID
REPORT_YEAR_FALL
SCHOOL_YEAR_FALL
LOAD_DATE
STUDENT_COUNT
Use ⌥ + scroll to pan the table
5 columns, 10,000+ rows
4.1 seconds (16 hours ago)
0
Partially Revised Enrollment Table
Untitled

NCES_DISTRICT_ID
REPORT_YEAR_FALL
Use ⌥ + scroll to pan the table
2 columns, 0 rows
2.4 seconds (16 hours ago)
0
I was cautious about trying to look at trends over time within the enrollment (and finances) table because the most recent data was from 2018. Ultimately, I decided to just take the most recent year of data form the enrollment table and the finances table. Originally, I had planned to filter out any data from before the pandemic, but decided to temporarily settle on using the most recent (non-null) report_year.
0
*If this were an actual project for work, I would ask the team responsible for ETL to update the data from , although it looks like the most recent completed dataset is only from 2019-2020. (If I was allowed to use outside sources for this project, I would upload both datasets to Databricks and do my analysis with the updated data. In an ideal world, I would wait until the 2020-2021 data was also available so I could look for trends over time.)
0
Enrollment_Revised Table
Untitled

NCES_DISTRICT_ID
MOST_RECENT_REPORT_YEAR
STUDENT_COUNT
Use ⌥ + scroll to pan the table
3 columns, 10,000+ rows
3.4 seconds (20 hours ago)
0
Finances Table
Finances
NCES_DISTRICT_ID
REPORT_YEAR
TOTAL_REVENUE
TOTAL_EXPENSES
LOCAL_REVENUE
LOAD_DATE
FILENAME
Use ⌥ + scroll to pan the table
7 columns, 10,000+ rows
5 seconds (18 hours ago)
0
Total Rows in Finances Table
Untitled

total district_ids
Use ⌥ + scroll to pan the table
1 column, 1 row
3 seconds (18 hours ago)
0
Total Unique District_ids in Finances Table
Untitled

unique district_ids
Use ⌥ + scroll to pan the table
1 column, 1 row
2.7 seconds (18 hours ago)
0
finances table
Untitled

NCES_DISTRICT_ID
REPORT_YEAR
TOTAL_REVENUE
TOTAL_EXPENSES
LOCAL_REVENUE
LOAD_DATE
FILENAME
Use ⌥ + scroll to pan the table
7 columns, 0 rows
2.7 seconds (18 hours ago)
0
finances_r table
finances_r
NCES_DISTRICT_ID
REPORT_YEAR_FALL
TOTAL_REVENUE
TOTAL_EXPENSES
LOCAL_REVENUE
LOAD_DATE
Use ⌥ + scroll to pan the table
6 columns, 10,000+ rows
4.3 seconds (18 hours ago)
0
finances_revised table
finances_revised
NCES_DISTRICT_ID
MOST_RECENT_REPORT_YEAR
TOTAL_REVENUE
TOTAL_EXPENSES
LOCAL_REVENUE
Use ⌥ + scroll to pan the table
5 columns, 10,000+ rows
3.6 seconds (20 hours ago)
0

Part 2: ID Top Clients of 2020
0

I defined "Top Clients" based on highest net payment volume from 1/1/2020 to 12/31/2020.
0
*I was told that payment volume was closely correlated with revenue. Since I was not able to determine costs (such as set-up/maintenance costs for different sized clients) I just used net payment volume during 2020 as my KPI (Key Performance Indicator).
0
Client_Transactions_2020 Table
Untitled

NCES_DISTRICT_ID
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
CITY
STATE
DISTRICT_TYPE
LOWEST_GRADE
HIGHEST_GRADE
LEVEL
NUMBER_OF_SCHOOLS
STATUS
CHARTER_DISTRICT
PAYMENT_DATE
PAYMENT_TYPE
PAYMENT_VOLUME
PAYMENT_QTY
Use ⌥ + scroll to pan the table
16 columns, 10,000+ rows
5 seconds (16 hours ago)
0
Total Rows in client_transactions_2020 Table
Untitled

total rows
Use ⌥ + scroll to pan the table
1 column, 1 row
2.4 seconds (16 hours ago)
0
Total Unique District_IDs
Untitled

total unique district_ids
Use ⌥ + scroll to pan the table
1 column, 1 row
2.4 seconds (16 hours ago)
0
Districts Ordered By Net Volume
Untitled

NCES_DISTRICT_ID
NET_PAYMENT_VOLUME
Use ⌥ + scroll to pan the table
2 columns, 1,269 rows
0.6 seconds (18 hours ago)
0
Net Income 2020 Based on Net Volume
Use ⌥ + scroll to pan the visual
2 columns, 1,269 rows
2
client_transactions_2020
Untitled

NCES_DISTRICT_ID
NET_PAYMENT_VOLUME
RANK
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
CITY
STATE
DISTRICT_TYPE
LOWEST_GRADE
HIGHEST_GRADE
LEVEL
NUMBER_OF_SCHOOLS
STATUS
CHARTER_DISTRICT
PAYMENT_DATE
PAYMENT_TYPE
PAYMENT_QTY
Use ⌥ + scroll to pan the table
17 columns, 10,000+ rows
3.5 seconds (20 hours ago)
0
Mean Payment Volume
Untitled

AVERAGE_PAYMENT_VOLUME
Use ⌥ + scroll to pan the table
1 column, 1 row
1 second (18 hours ago)
0
Median Payment Volume
Untitled

MEDIAN_NET_PAYMENT_VOLUME
Use ⌥ + scroll to pan the table
1 column, 1 row
0.9 seconds (20 hours ago)
0
--Since there are 1,269 different clients (nces_district_id), the top 1% would be approximately the top 12 ranked clients by net_income generated in 2020.
0
(Normally, I would have done this in a temp table, but I did not have permissions to create one with this sample dataset. I also would have left 1% as a variable if I were doing this in R or created it as an adjustable value in a dashboard on Tableau using parameters/add action. However, since this was an exercise in SQL, I left it at 1%.)
0

Part 2: Data Analysis
0

Top Clients
0

Clients Categorized by Top 1%
Categorized clients (nces_district_id) as top 1% or not top 1%
NCES_DISTRICT_ID
NET_PAYMENT_VOLUME
RANK
IS_TOP_1_PERCENT
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
CITY
STATE
DISTRICT_TYPE
LOWEST_GRADE
HIGHEST_GRADE
LEVEL
NUMBER_OF_SCHOOLS
STATUS
CHARTER_DISTRICT
PAYMENT_DATE
PAYMENT_TYPE
PAYMENT_QTY
Use ⌥ + scroll to pan the table
18 columns, 10,000+ rows
3.8 seconds (19 hours ago)
0
client_transactions_2020_ranked
Untitled

NCES_DISTRICT_ID
NET_PAYMENT_VOLUME
RANK
IS_TOP_1_PERCENT
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
CITY
STATE
DISTRICT_TYPE
LOWEST_GRADE
HIGHEST_GRADE
LEVEL
NUMBER_OF_SCHOOLS
STATUS
CHARTER_DISTRICT
PAYMENT_DATE
PAYMENT_TYPE
PAYMENT_QTY
Use ⌥ + scroll to pan the table
18 columns, 10,000+ rows
4.1 seconds (19 hours ago)
0

What characteristics set the top 1% of clients apart from the rest?
0

State and Number of Districts
Untitled

STATE
IS_TOP_1_PERCENT
AVERAGE_VOLUME
NUM_DISTRICTS
Use ⌥ + scroll to pan the table
4 columns, 47 rows
1.8 seconds (19 hours ago)
0
  • Florida primarily uses FOCUS SIS which is currently not supported. Florida has the 3rd largest population, and might be a good target for developing future software to expand compatibility with school information systems. It might be worthwhile to find data on most common SIS software by state to increase marketability.
    0
State, City, Number of Districts
State, City, Number of Districts
STATE
CITY
IS_TOP_1_PERCENT
AVERAGE_VOLUME
NUM_DISTRICTS
Use ⌥ + scroll to pan the table
5 columns, 1,169 rows
1.3 seconds (4 hours ago)
0
Looking at the worst performing cities, Howell, MI is actually losing money. If we sort by number of districts (num_districts) descending, we see that Phoenix has 7 districts (highest number per city) as clients but makes significantly less money than the average district.
0
Note: I might want to go back and fix the capitalization for city names, although this shouldn't impact results since I didn't see any duplicates with all caps.
0
District Type Distribution For Top Performing Clients
Untitled

DISTRICT_TYPE
IS_TOP_1_PERCENT
AVERAGE_VOLUME
NUM_DISTRICTS
Use ⌥ + scroll to pan the table
4 columns, 7 rows
28.8 seconds (19 hours ago)
0
We can see from the table that regular public school districts that are not a component of a supervisory union are the most profitable. All of the top 1% of clients are this district type.
0
Number of Schools Per District For Top Performing Clients
Untitled

NCES_DISTRICT_ID
STUDENT_COUNT
NET_VOLUME_DISTRICT
IS_TOP_1_PERCENT
Use ⌥ + scroll to pan the table
4 columns, 1,269 rows
1.1 seconds (17 hours ago)
0
Note: 4 of the 12 districts in the top 1% are missing student_count.
0
Charter District Status for Top 1%
Untitled

CHARTER_DISTRICT
IS_TOP_1_PERCENT
AVERAGE_VOLUME
Use ⌥ + scroll to pan the table
3 columns, 3 rows
0.9 seconds (16 hours ago)
0
All of the top 1% are NOT charter schools.
0
Level of Top 1%
Untitled

LEVEL
IS_TOP_1_PERCENT
AVERAGE_VOLUME
Use ⌥ + scroll to pan the table
3 columns, 9 rows
2.2 seconds (16 hours ago)
0
All of the top 1% of values are high school or other (meaning multiple levels).
0
I was leery to use data from the Finance and Enrollment tables because both had extremely out-of-date information (2016-17 was the most recent school year).
0
Total_Revenue of Top 1%
Untitled

TOTAL_REVENUE
IS_TOP_1_PERCENT
AVERAGE_VOLUME
Use ⌥ + scroll to pan the table
3 columns, 1,156 rows
2.7 seconds (15 hours ago)
0
Total_Revenue of Top 1%
Use ⌥ + scroll to pan the visual
IS_TOP_1_PERCENT
FALSE
TRUE
3 columns, 1,156 rows
0
Total Expenses
Total Expenses
TOTAL_EXPENSES
IS_TOP_1_PERCENT
AVERAGE_VOLUME
Use ⌥ + scroll to pan the table
3 columns, 1,157 rows
3.3 seconds (15 hours ago)
0
Untitled

Use ⌥ + scroll to pan the visual
IS_TOP_1_PERCENT
FALSE
TRUE
3 columns, 1,157 rows
0
local_revenue
Local Revenue
LOCAL_REVENUE
IS_TOP_1_PERCENT
AVERAGE_VOLUME
Use ⌥ + scroll to pan the table
3 columns, 1,152 rows
2 seconds (15 hours ago)
0
Untitled

Use ⌥ + scroll to pan the visual
IS_TOP_1_PERCENT
FALSE
TRUE
3 columns, 1,152 rows
0
Local Vs Total Revenue
Untitled

LOCAL_REVENUE
TOTAL_REVENUE
IS_TOP_1_PERCENT
AVERAGE_VOLUME
Use ⌥ + scroll to pan the table
4 columns, 1,168 rows
3.2 seconds (9 hours ago)
0
Local Revenue Vs Total Revenue
Use ⌥ + scroll to pan the visual
IS_TOP_1_PERCENT
FALSE
TRUE
4 columns, 1,168 rows
0
District Net Income
District Net Income
NET_INCOME
IS_TOP_1_PERCENT
AVERAGE_VOLUME
Use ⌥ + scroll to pan the table
3 columns, 1,098 rows
2.1 seconds (15 hours ago)
0
Untitled

Use ⌥ + scroll to pan the visual
IS_TOP_1_PERCENT
FALSE
TRUE
3 columns, 1,098 rows
0
Number of Transactions In 2020
Untitled

AVERAGE_NUM_TRANSACTIONS
AVERAGE_VOLUME
IS_TOP_1_PERCENT
Use ⌥ + scroll to pan the table
3 columns, 2 rows
1.7 seconds (9 hours ago)
0
Untitled

Use ⌥ + scroll to pan the visual
IS_TOP_1_PERCENT
FALSE
TRUE
3 columns, 2 rows
0
Districts with more transactions are more likely to be in the top 1% of clients.
0

How can we identify which clients could become top clients in future years?
0

* Greater Number of Transactions
0

* Not Charter School
0

* Regular Public School that is Not a Component of a Supervisory Union
0

* Greater Number of Students (but less than 80,000)
0

* High School or Other (Multiple Grade Level Groups like K-12)
0

* Total Revenue Between $134,000,000 and $858,000,000
0


0
I wanted to search for the set of schools that were not current clients that had these characteristics, but the information was not available in this dataset. I would ask the data engineering team to add a table with charter_district, district_type, and level to the database. Alternatively, I could upload the dataset from SQL_TEST and the data publicly available via to Microsoft SQL Server. (I couldn't use databricks, because the files would likely be over my upload data limit for the free community edition.)
0

What characteristics may indicate that our clients will cancel?
0

Account Activation Table
Untitled

NCES_DISTRICT_ID
CLIENT_ADD_DATE
CLIENT_CANCEL_DATE
DAYS_DIFF
STATE
DISTRICT_TYPE
LOWEST_GRADE
HIGHEST_GRADE
LEVEL
NUMBER_OF_SCHOOLS
STATUS
CHARTER_DISTRICT
ACCOUNT_ACTIVATION
Use ⌥ + scroll to pan the table
13 columns, 1,825 rows
1.9 seconds (8 hours ago)
0
Cancelled By State
Cancelled BY State
STATE
NUM_ACTIVE_ACCOUNTS
NUM_CANCELLED_ACCOUNTS
RATIO_ACTIVE_TO_CANCELLED
Use ⌥ + scroll to pan the table
4 columns, 43 rows
0.7 seconds (5 hours ago)
0
Cancelled By District Type
Ordered From Most at Risk of Cancelling to Least at Risk of Cancelling
DISTRICT_TYPE
NUM_ACTIVE_ACCOUNTS
NUM_CANCELLED_ACCOUNTS
RATIO_ACTIVE_TO_CANCELLED
Use ⌥ + scroll to pan the table
4 columns, 8 rows
1.5 seconds (5 hours ago)
0
Cancelled By Level
Ordered From Most at Risk of Cancelling to Least at Risk of Cancelling
LEVEL
NUM_ACTIVE_ACCOUNTS
NUM_CANCELLED_ACCOUNTS
RATIO_ACTIVE_TO_CANCELLED
Use ⌥ + scroll to pan the table
4 columns, 7 rows
1.7 seconds (5 hours ago)
0
Cancelled By Number of Schools
Ordered From Most at Risk of Cancelling to Least at Risk of Cancelling
NUMBER_OF_SCHOOLS
NUM_ACTIVE_ACCOUNTS
NUM_CANCELLED_ACCOUNTS
RATIO_ACTIVE_TO_CANCELLED
Use ⌥ + scroll to pan the table
4 columns, 79 rows
1.4 seconds (5 hours ago)
0
Cancelled By Charter District (Charter of Non Charter
Ordered From Most at Risk of Cancelling to Least at Risk of Cancelling
CHARTER_DISTRICT
NUM_ACTIVE_ACCOUNTS
NUM_CANCELLED_ACCOUNTS
RATIO_ACTIVE_TO_CANCELLED
Use ⌥ + scroll to pan the table
4 columns, 2 rows
1.6 seconds (5 hours ago)
0
PART 5: CAVEATS AND FUTURE CONSIDERATIONS
0
Caveats:
0
  • Data from the Finances and Enrollment tables were out of date and also pre-pandemic.
    0
  • I tried to limit my analysis to data from after the pandemic began. As a former teacher, I have experienced first-hand the degree to which school needs, resources, and procedures have changed as a result of COVID-19.
    0
  • need for 1-1 devices may increase district likelihood to adopt this technology
    0
  • early/mid pandemic (2020-2021): many schools were not permitted to collect items (such as paper money) or use shared classroom supplies (like sharing computers in a lab setting)
    0
  • Most school districts prioritized updating technology and ensuring students of all ages had 1-1 devices for remote/hybrid learning. Additionally, most districts invested in increasing at-home internet access for low income students. Having internet at home and take-home laptops/ipads will likely make parents more likely to be able to use the app to pay electronically
    0
  • The teacher/substitute shortage increases class size and decreases teacher planning time. Additionally, the adoption of hybrid learning and the in general increased teacher responsibilities during the pandemic leads to more stress and less time for teachers. Since collecting fees from fundraising/school fees is typically the homeroom teacher's responsibility, removing this extra task from teachers' long list of jobs would decrease teachers' workloads.
    0
  • Missing/Incomplete Data
    0
Future Directions:
0
  • I would like to have updated the finance and enrollment tables with data from after the pandemic started.
    0
  • I didn't have time to use "case" statements to recategorize client grade Levels (other)
    0
  • I didn't have time to look at average number of transactions/time gap between most recent transaction and cancellation rate.
    0

0
Actions
Style
Document
Move up
Move down
Move to top
Move to bottom
Copy link
Duplicate
D
Insert above
A
Insert below
B
Run cell
Rename cell
R
Hide cell
H
Reference cell
Create visual
V
Expand
E
Download CSV
Copy SQL
Delete
?
By using this website you agree to our Cookie Policy.